BRIDGE: Byzantine-Resilient Decentralized Gradient Descent
Authors
Abstract
Machine learning has begun to play a central role in many applications. A multitude of these applications typically also involve datasets that are distributed across multiple computing devices/machines, due either to design constraints (e.g., multi-agent and Internet-of-Things systems) or to computational/privacy reasons (e.g., large-scale machine learning on smartphone data). Such applications often require the learning tasks to be carried out in a decentralized fashion, in which there is no central server directly connected to all nodes. In real-world settings, nodes are prone to undetected failures due to malfunctioning equipment, cyberattacks, etc., which are likely to crash non-robust learning algorithms. The focus of this paper is robustification of decentralized learning in the presence of nodes that have undergone Byzantine failures. The Byzantine failure model allows faulty nodes to arbitrarily deviate from their intended behaviors, thereby ensuring designs that are most robust to such failures. But the study of Byzantine resilience within decentralized learning, in contrast to distributed learning, is still in its infancy. In particular, existing Byzantine-resilient decentralized learning methods do not scale well to large models, and they lack statistical convergence guarantees that help characterize their generalization errors. In this paper, a scalable, Byzantine-resilient decentralized machine learning framework termed Byzantine-resilient decentralized gradient descent (BRIDGE) is introduced. Algorithmic and statistical convergence guarantees for one variant of BRIDGE are provided for both strongly convex problems and a class of nonconvex problems. In addition, large-scale decentralized learning experiments are used to establish that BRIDGE is scalable and that it delivers competitive results for Byzantine-resilient convex and nonconvex learning.
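The framework described in the abstract combines a screening step over the model iterates received from neighbors with a local gradient update at each node. The following minimal Python sketch illustrates one such screened-consensus update, assuming a coordinate-wise trimmed-mean screening rule with trimming parameter b; the function names, the fully connected toy network, and the synthetic quadratic objectives are illustrative assumptions and not code from the paper.

```python
import numpy as np

def trimmed_mean(vectors, b):
    """Coordinate-wise trimmed mean: at each coordinate, drop the b largest and
    b smallest received values, then average the remaining ones."""
    V = np.sort(np.stack(vectors), axis=0)           # sort each coordinate independently
    return V[b:len(vectors) - b].mean(axis=0)        # average the middle values

def bridge_style_step(x_i, neighbor_xs, grad_i, step_size, b):
    """One screened-consensus update at node i: screen and mix the received
    iterates (together with x_i), then take a local gradient step."""
    screened = trimmed_mean(neighbor_xs + [x_i], b)
    return screened - step_size * grad_i(x_i)

# Toy usage on a fully connected 5-node network with quadratic local objectives
# f_i(x) = 0.5 * ||x - t_i||^2 (illustrative only).
rng = np.random.default_rng(0)
targets = rng.normal(size=(5, 3))                    # 5 nodes, 3-dimensional model
grads = [lambda x, t=t: x - t for t in targets]      # gradient of 0.5*||x - t||^2
xs = [np.zeros(3) for _ in range(5)]
for k in range(200):
    received = [[xs[j] for j in range(5) if j != i] for i in range(5)]
    xs = [bridge_style_step(xs[i], received[i], grads[i], 0.05, b=1) for i in range(5)]
```

With b chosen at least as large as the number of faulty neighbors, the trimming step prevents any single arbitrarily corrupted iterate from dominating the coordinate-wise average, which is the intuition behind screening-based resilience.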
Similar resources
ByRDiE: Byzantine-resilient distributed coordinate descent for decentralized learning
Distributed machine learning algorithms enable processing of datasets that are distributed over a network without gathering the data at a centralized location. While efficient distributed algorithms have been developed under the assumption of faultless networks, failures that can render these algorithms nonfunctional indeed happen in the real world. This paper focuses on the problem of Byzantin...
Byzantine Stochastic Gradient Descent
This paper studies the problem of distributed stochastic optimization in an adversarial setting where, out of the m machines which allegedly compute stochastic gradients every iteration, an α-fraction are Byzantine, and can behave arbitrarily and adversarially. Our main result is a variant of stochastic gradient descent (SGD) which finds ε-approximate minimizers of convex functions in T = Õ ( 1...
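For intuition on the server-based setting sketched in that snippet, the code below shows a gradient step that aggregates worker reports with a coordinate-wise median, a common Byzantine-resilient heuristic; it is an illustrative assumption and not necessarily the specific algorithm analyzed in that paper.

```python
import numpy as np

def robust_sgd_step(x, reported_grads, step_size):
    """One server-side step: aggregate the reported stochastic gradients with a
    coordinate-wise median so that a minority of arbitrary (Byzantine) reports
    cannot pull the aggregate far from the honest majority."""
    agg = np.median(np.stack(reported_grads), axis=0)
    return x - step_size * agg

# Toy usage: 7 honest workers report noisy gradients of 0.5*||x||^2 while 2
# Byzantine workers report arbitrary values; the median-based step still
# drives x toward the minimizer at the origin.
rng = np.random.default_rng(1)
x = np.ones(4)
for _ in range(100):
    honest = [x + 0.1 * rng.normal(size=4) for _ in range(7)]
    byzantine = [1e3 * rng.normal(size=4) for _ in range(2)]
    x = robust_sgd_step(x, honest + byzantine, step_size=0.1)
```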
On Nonconvex Decentralized Gradient Descent
Consensus optimization has received considerable attention in recent years. A number of decentralized algorithms have been proposed for convex consensus optimization. However, on consensus optimization with nonconvex objective functions, our understanding to the behavior of these algorithms is limited. When we lose convexity, we cannot hope for obtaining globally optimal solutions (though we st...
Asynchronous Decentralized Parallel Stochastic Gradient Descent
Recent work shows that decentralized parallel stochastic gradient decent (D-PSGD) can outperform its centralized counterpart both theoretically and practically. While asynchronous parallelism is a powerful technology to improve the efficiency of parallelism in distributed machine learning platforms and has been widely used in many popular machine learning softwares and solvers based on centrali...
On the Convergence of Decentralized Gradient Descent
Consider the consensus problem of minimizing f(x) = ∑n i=1 fi(x) where each fi is only known to one individual agent i belonging to a connected network of n agents. All the agents shall collaboratively solve this problem and obtain the solution via data exchanges only between neighboring agents. Such algorithms avoid the need of a fusion center, offer better network load balance, and improve da...
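That snippet concerns the standard (non-robust) decentralized gradient descent iteration, in which each node mixes its neighbors' iterates with a doubly stochastic weight matrix W and then takes a local gradient step. The sketch below is a minimal illustration under an assumed small fully connected network and toy quadratic objectives; it is not code from any of the cited papers.

```python
import numpy as np

def dgd_round(X, W, grads, alpha):
    """One round of decentralized gradient descent: each node averages its
    neighbors' iterates using the mixing matrix W, then takes a local
    gradient step of size alpha."""
    mixed = W @ X                                        # consensus (mixing) step
    local = np.stack([g(X[i]) for i, g in enumerate(grads)])
    return mixed - alpha * local

# Illustrative fully connected 3-node network with a symmetric, doubly
# stochastic mixing matrix and local objectives f_i(x) = 0.5*||x - t_i||^2.
W = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])
targets = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
grads = [lambda x, t=t: x - t for t in targets]
X = np.zeros((3, 2))
for k in range(500):
    X = dgd_round(X, W, grads, alpha=0.05)
# All nodes approach a neighborhood of the minimizer of sum_i f_i,
# i.e. the mean of the targets, as is typical for constant-step DGD.
```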
Journal
Journal title: IEEE Transactions on Signal and Information Processing over Networks
Year: 2022
ISSN: 2373-776X, 2373-7778
DOI: https://doi.org/10.1109/tsipn.2022.3188456